Adjustable Byte Lane Offset For Memory Module to Reduce Skew

ABSTRACT

Disclosed herein are solutions for addressing the problem of skew of data within a byte lane caused by factors external to the integrated circuit or module providing the data. To compensate for such skew, an on-chip delay is added to the data out paths of those bits in the byte lane which otherwise would arrive early to their destinations. Such on-chip delay is provided by delay circuits preferably positioned directly before the output buffers/bond pads of the integrated circuit or module. By intentionally delaying some of the outputs from the integrated circuit or module, external skew is compensated for so that all data in the byte lane arrives at the destination at substantially the same time. In a preferred embodiment, the delay circuits are programmable to allow the integrated circuit or module to be freely tailored to environments having different skew considerations, such as different styles of connectors.

CROSS REFERENCE TO RELATED APPLICATIONS

This is a continuation of U.S. patent application Ser. No. 12/265,265, filed Nov. 5, 2008, which was a continuation of U.S. patent application Ser. No. 11/124,744, filed May 9, 2005 (now U.S. Pat. No. 7,457,978). Priority is claimed to both of these applications, and both are incorporated herein by reference in their entirety.

FIELD OF THE INVENTION

Embodiments of this invention relate to improving the skew in a byte lane in a memory module.

BACKGROUND

Memory modules (e.g., Single In-Line Memory Modules (SIMMs), Dual In-Line Memory Modules (DIMMs), and Small Outline DIMMs (SODIMMs)) are common in the computer industry, and generally comprise a printed circuit board (PCB) having a number of memory chips thereon. Such memory chips are usually DRAM memory chips, and more typically synchronous DRAMs (e.g., DDRx DRAMs). By incorporating several memory chips on a single PCB, the modules can hold large amounts of data, and thus are useful in computing applications. Generally, data is retrieved from the module by a call from some master device that needs access to the data, e.g., a microprocessor, which typically calls for eight bits of data (i.e., a “byte”) at one time.

A memory module 10 (shown in isolation in FIG. 1A) typically mounts to a system (such as mother board 12) by way of a connector 18, as shown in cross-section in FIG. 1B. In this particular example, the module 10 is a SODIMM module. SODIMM modules are useful in applications such as notebook computers because of their low profiles. This low profile is facilitated by the use of a 90-degree connector 18, which allows the module 10 to be positioned parallel to the mother board 12 when mounted in the connector 18.

The particular memory module 10 illustrated has memory chips 16 on the top (16 t) and bottom (16 b) of a PCB 14. As one skilled in the art will understand, the PCB 14 further contains contacts 20 at one edge of the PCB 14. These contacts 20 connect to pins on the memory chips 16 t and 16 b (not shown) via circuit traces in the PCB 14 (not shown). As illustrated, the contacts 20, like the memory chips, appear on the top (20 t) and bottom (20 b) of the PCB 14. Typically, such contacts are tinned or gold plated to ensure good electrical connection with the connector 18 as discussed further below.

When the memory module 10 is positioned within the connector 18 (e.g., by press fit, by the use of latches, or by other means in the art), as shown in FIG. 1B, the contacts 20 further connect to conductors 22 molded inside of the plastic connector body 18. These conductors 22 are in turn connected to traces on the mother board 12 (not shown) and ultimately to other electrical components on the mother board 12, such as a microprocessor (not shown). Because the conductors 22 communicate with both the top 20 t and bottom 20 b contacts, the conductors 22 within the connector 18 body will also be split into top (22 t) and bottom (22 b) conductors.

When the memory module 10 is so coupled to the mother board 12, it will be noticed that the electrical pathway between the contacts 20 and the motherboard 12 differs depending on whether top or bottom contacts are considered. This is because, by necessity, conductors 22 t are longer than conductors 22 b, e.g., by approximately 10 millimeters. As a result, the signals passing from the chips 16 through the top contacts 20 t and top conductors 22 t will arrive at the mother board 12 slightly delayed with respect to similar signals passing through the bottom contacts 20 b and bottom conductors 22 b.

This difference in length has a small, but potentially critical, effect on the timing of the signals that pass through the conductors 22. For example, suppose a microprocessor on the mother board 12 calls to the memory module to provide a byte of data (from outputs DQ0-DQ7). These signals (e.g., in a DDRx DRAM module) appear on opposite sides of the memory module 10, as shown in FIG. 1C. Specifically, the first four bits, DQ0-DQ3, or “nibble” of data corresponding to pins 5, 7, 15 and 17 on the module, are output on the bottom contacts 20 b of the module. The other nibble, DQ4-DQ7, corresponding to pins 4, 6, 16, and 18 of the module 10, are output on the top contacts 20 t of the module. (Although a typical DDRx DRAM module would have many dozens of pins, only a few are shown in FIG. 1C).

However, data from these module outputs will typically be called for at the same time, i.e., on a byte basis. When the microprocessor makes such a call, the length difference inside the connector will cause the data corresponding to the nibble DQ0-DQ3 to arrive at the mother board 12 slightly before nibble DQ4-DQ7, e.g., perhaps on the order of 50 picoseconds or so. That is to say, a 50 ps “skew” is introduced in the byte lane. While this delay is relatively small, it can represent a significant portion of the data valid window on a memory module containing high speed memory chips (e.g., 20% of the data valid window on a DDR3 DRAM module).
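By way of illustration only, the figures quoted above can be related by simple arithmetic. In the following sketch, the per-millimeter propagation delay and the 250 ps data valid window are not stated in this disclosure; they are assumptions back-derived from the approximately 10 millimeter length difference, the approximately 50 ps skew, and the 20% example. All names are hypothetical.

```python
# Illustrative arithmetic only; names and derived values are assumptions.
LENGTH_DIFF_MM = 10.0   # approximate extra length of top conductors 22 t
SKEW_PS = 50.0          # approximate skew quoted for the 90-degree connector

# Implied propagation delay per millimeter (back-derived, not quoted).
ps_per_mm = SKEW_PS / LENGTH_DIFF_MM


def skew_fraction(skew_ps: float, window_ps: float) -> float:
    """Return the fraction of the data valid window consumed by the skew."""
    return skew_ps / window_ps


# Window size for which a 50 ps skew is 20% (inferred from the DDR3 example).
implied_window_ps = SKEW_PS / 0.20
```

Under these assumptions, a connector with a different conductor length difference would simply scale the skew, and hence the compensating delay, proportionally.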

To put this problem into further perspective, FIG. 2 shows the timing of the signals comprising the byte lane as they reach the mother board 12. The data is accompanied by a data valid signal, DQS, which is also sent by the module 10 when the byte is called for. Essentially, DQS represents a signal which indicates to the calling entity, e.g., the microprocessor, when the data called for is valid. The DQS signal, as to this particular byte, is also provided on the top contact 20 t of the module 10. As is shown, the DQS signal arrives at the motherboard when nibble DQ4-DQ7 also arrives, as they are all provided through the top contacts 20 t of the module and the top conductors 22 t of the connector 18. However, nibble DQ0-DQ3 is output to the bottom contacts 20 b of the module 10, and thus arrives earlier by virtue of its shorter path through conductors 22 b in the connector 18. The result of this skew is that the DQS signal does not accurately indicate to the microprocessor when valid data is present for the entirety of the byte lane.

This problem has been rectified in the prior art by adjusting the lengths of the electrical traces on the mother board. Specifically, the traces between the connector 18 and, for example, the microprocessor on the mother board 12 were lengthened for the “earlier” nibble, DQ0-DQ3 in the present example. In other words, the mother board traces for the earlier nibble would be longer than those for the later nibble, DQ4-DQ7. In so doing, and assuming the increased trace length compensates for the timing differential caused by the connector conductors 22 t and 22 b, the signals will be provided to the microprocessor at the same time, overcoming this problem.

However, this prior art solution is not optimal. First, it requires the mother board design to account for delays caused by the connector 18 and to specifically engineer the trace lengths. This may be inconvenient. Moreover, an otherwise undesired diversion in the trace length (such as a serpentine) is required, and may not be possible if space does not permit on the mother board.

Second, such lengthening of trace lengths essentially tailors the motherboard for a particular connector, rendering the motherboard non-optimal if other types of connectors are to be used. For example, consider the 0-degree connector 18 of FIG. 3. This connector orients the module 10 perpendicularly to the mother board 12 when mounted, as would be typical in a desktop computer. Moreover, given this configuration, it can be seen that the conductors 22 inside of the connector 18 are of the same length. In other words, the 0-degree connector 18 of FIG. 3 does not cause the same skew problem between nibbles in the byte lane as does the 90-degree connector 18 of FIG. 1B. Therefore, if the trace lengths on the mother board 12 are optimized for a particular type of connector, the use of other connectors would be non-optimal. Hence, adjustment of trace lengths does not make for a universal solution.

SUMMARY

Disclosed herein are solutions for addressing the problem of skew of data within a byte lane caused by factors external to the integrated circuit or module providing the data. One such external factor can comprise the use of a connector with internal conductors of different lengths that adds skew to the integrated circuit's or module's byte lane, which otherwise is called and desired to be provided synchronously in parallel to its destination (e.g., a mother board or microprocessor). To compensate for such skew, an on-chip delay is added to the data out paths of those bits in the byte lane which otherwise would arrive early to their destinations. Such on-chip delay is provided by delay circuits preferably positioned directly before the output buffers/bond pads of the integrated circuit or the integrated circuits on the module. By intentionally delaying some of the outputs from the integrated circuit or module, external skew is compensated for so that all data in the byte lane arrives at the destination at substantially the same time. In a preferred embodiment, the delay circuits are programmable to allow the integrated circuit or module to be freely tailored to environments having different skew considerations, such as different styles of connectors.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the inventive aspects of this disclosure will be best understood with reference to the following detailed description, when read in conjunction with the accompanying drawings, in which:

FIG. 1A illustrates a perspective view of a prior art memory module.

FIG. 1B illustrates a cross-section of the module of FIG. 1A mounted to a mother board by a 90-degree connector.

FIG. 1C illustrates a table showing the pin outs on the module of FIG. 1A for exemplary bits in a byte lane.

FIG. 2 illustrates timing signals for bits in a byte lane to show the problem of skew within the byte lane, and resulting skew with a data valid signal.

FIG. 3 illustrates the module of FIG. 1A mounted to a mother board by a 0-degree connector.

FIGS. 4A and 4B illustrate embodiments in which delay circuits are selectively provided in the data output paths to provide a compensating skew in a byte lane of data.

FIG. 4C illustrates an embodiment similar to FIG. 4A in which programmable delay circuits are used.

FIG. 5A illustrates a delay circuit useable in the context of FIGS. 4A and 4B.

FIGS. 5B-5E illustrate programmable delay circuits useable in the context of FIG. 4C.

DETAILED DESCRIPTION

The problem of skew of data within the byte lane of a memory module is solved by introducing an on-chip delay to certain output signals within the memory chips themselves. This on-chip delay is designed, in one embodiment, to compensate for skew that would otherwise be introduced by the connector (e.g., a 90-degree connector) used to connect the module to the mother board.

However, before the specific solution to this problem is addressed, it should be realized that the on-chip delay technique disclosed herein can be used in broader manners and different contexts. For example, the on-chip delay technique can be used to compensate for skews appearing on any parallel stream of data, and is not limited to addressing skew within a byte lane per se. The on-chip delay technique can also be used to compensate for skews related to the use of external factors other than connectors, including skews introduced by other devices external to the memory chips. The technique can also be used with respect to skews internal to the chips themselves. While particularly useful in the context of memory chips, and more specifically memory modules, the techniques are adaptable to other technologies as well, such as microprocessors and multi-chip modules more generally. The technique has further pertinence to individual integrated circuits not comprising portions of a module. In short, the on-chip delay technique provides a broad solution to many different potential problems of skew. Moreover, such on-chip delays can be made programmable, as explained further below.

In one embodiment, assume the specific problem of byte lane skew in a memory module introduced by the use of a 90-degree connector 18, such as shown in FIG. 1B. The solution to this specific problem does not require a programmable on-chip delay, as the delay associated with this particular connector 18 would be constant. Accordingly, a non-programmable delay can be used, with the goal of providing an approximately 50 ps delay to that nibble in the byte lane (called the “earlier nibble”) which would otherwise arrive at the mother board before the other nibble (the “later nibble”).

This is illustrated in one embodiment in FIG. 4A. Shown are eight memory chips 50₀-50₇ together representing a byte lane of data (DQ0-DQ7) from the memory module PCB on which the chips 50ₓ are mounted (not shown for clarity). Depicted for each chip 50ₓ is the data out path, which comprises an internal data out signal 52, a standard output buffer 56, a bond pad 58, and a bond wire 60 that ultimately connects to the lead frame of the package in which the chips 50ₓ are positioned (not shown), and ultimately to the various contacts 20 (FIGS. 1A-B) on the module 10.

As depicted, it is assumed that each of the eight DQ signals on the module 10 is derived from one bit of each of the eight memory chips 50ₓ. That is to say, it is assumed that x1 DRAMs are used. Of course, this need not be the case, and the various data paths comprising the module's byte lane may be integrated on one or more memory chips. For example, each of the nibbles may be provided by four separate outputs on two different memory chips 55ₓ (shown in dotted lines). More typically, in current-day DDR DRAM modules, the eight data paths comprising a particular byte lane are all integrated on one DRAM memory chip 57, as shown in FIG. 4B. In other words, typically x8 or x16 DRAMs are used, capable of outputting a byte or word of data in parallel. (Moreover, FIG. 4B illustrates the utility of the disclosed technique even when a call for data is made to a single integrated circuit).

In any event, as to the earlier nibble in the byte lane (DQ0-DQ3 in the example discussed earlier), a delay circuit 54 has been introduced between the data out signal 52 and the output buffer 56 so as to delay the signals on those data paths by an appropriate time (i.e., t=50 ps). The delay circuit 54 can be placed anywhere along the data path, including earlier “up stream” in the chip. Note that this delay circuit 54 does not appear in the later nibble in FIGS. 4A or 4B. Accordingly, when the microprocessor, for example, calls for data from the byte lane, the earlier nibble will be delayed by the delay circuits 54 by 50 ps, just as the later nibble will be delayed by virtue of the additional length of the top conductors 22 t (FIG. 1B) in the connector. The result is that the byte will arrive at the mother board with no or reduced skew. This allows the data valid window to be set around the arrival of data at the microprocessor with greater accuracy and buffer, without the need to lengthen any circuit traces on the mother board. Although not shown, the data valid signal, DQS, may also be delayed if necessary, more specifically if it were provided on the bottom side of the module, although this is unnecessary in the example discussed earlier (see FIG. 2).
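The compensation described above can be sketched, purely for illustration, as a small timing model. The model assumes only the relative figures given in this disclosure (an extra delay of approximately 50 ps through the top conductors 22 t relative to the bottom conductors 22 b); the dictionary and function names are hypothetical and do not correspond to any actual circuit element.

```python
# Hypothetical timing model of the byte lane; all names are illustrative.
# Extra connector delay, relative to the bottom conductors 22 b, in ps.
CONNECTOR_DELAY_PS = {"top": 50, "bottom": 0}

# Side of the module on which each bit is output, per the example discussed:
# DQ0-DQ3 on the bottom contacts 20 b, DQ4-DQ7 on the top contacts 20 t.
SIDE = {f"DQ{i}": "bottom" for i in range(4)}
SIDE.update({f"DQ{i}": "top" for i in range(4, 8)})


def arrival_times(on_chip_delay_ps):
    """Relative arrival time of each bit at the mother board, in ps."""
    return {
        bit: on_chip_delay_ps.get(bit, 0) + CONNECTOR_DELAY_PS[side]
        for bit, side in SIDE.items()
    }


def skew(times):
    """Spread between the earliest and latest bit in the byte lane, in ps."""
    return max(times.values()) - min(times.values())


# No on-chip delay: the earlier nibble leads the later nibble by 50 ps.
uncompensated = arrival_times({})

# Delay circuits 54 on the earlier nibble only (t = 50 ps).
compensated = arrival_times({f"DQ{i}": 50 for i in range(4)})
```

In this model, `skew(uncompensated)` evaluates to 50 and `skew(compensated)` to 0, mirroring the statement that the byte arrives with no or reduced skew.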

FIG. 5A shows a simple way in which the on-chip delay circuit 54 can be fabricated. As shown, delay is introduced simply by providing a series of inverters 62. Each inverter 62 provides some amount of delay to the internal data out signal 52. Typically, this delay for a CMOS inverter is on the order of 10-20 ps for current CMOS technologies, and can be easily scaled by adjusting the gate lengths and widths of the NMOS and PMOS transistors which make up the inverter. (More specifically, and as one skilled in the art understands, the delay time of a CMOS inverter can be approximated by Δt = C*ΔVdd/I, where C is the capacitance of the load, Vdd is the power supply voltage, and I is the drive current. I can be adjusted by adjusting the width or length of the transistor). Therefore, assuming the transistors in the inverters 62 are properly scaled, a delay of approximately 15 ps (for example) can be achieved for each, with four in series providing a delay of approximately 60 ps, acceptably close to the 50 ps skew introduced by the 90-degree connector 18 (FIG. 1B). (An even number of inverters would be preferred to preserve the polarity of the internal data out signal 52).
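As a hedged illustration of the sizing reasoning above, the following sketch selects an even, non-zero number of inverter stages whose total delay is nearest a target delay. The function name and the rounding rule are assumptions made for this example; an actual design would be verified by circuit simulation rather than by this arithmetic.

```python
def inverter_chain(target_ps: float, per_stage_ps: float) -> tuple[int, float]:
    """Pick an even, non-zero stage count whose total delay is nearest the target.

    Stages are added in pairs so that the polarity of the signal is preserved.
    Returns (number_of_stages, total_delay_ps).
    """
    if target_ps <= 0 or per_stage_ps <= 0:
        raise ValueError("delays must be positive")
    # Each pair of inverters contributes 2 * per_stage_ps of delay.
    pairs = max(1, round(target_ps / (2 * per_stage_ps)))
    stages = 2 * pairs
    return stages, stages * per_stage_ps


# Example from the text: 15 ps per inverter, 50 ps of skew to compensate.
stages, total_ps = inverter_chain(50, 15)
```

With the example values of 15 ps per inverter and a 50 ps target, the sketch selects four stages for a total of 60 ps, consistent with the figures given above.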

Even further preferable to the delay circuit 54 of FIG. 5A are delay circuits that are programmable. As applied to the byte lane data skew problem discussed above, programmability is desirable to provide greater flexibility in the type of connectors 18 (FIG. 1B) with which the modules can be used. As noted in the Background section, a given module 10 can be used with varying types of connectors 18, such as 90-degree (FIG. 1B) or 0-degree connectors (FIG. 3), and thus a fixed delay circuit 54 would not be optimal were it desired to use the module with either of these types of connector.

FIG. 4C illustrates the use of programmable delay circuits 54. In this example, and compared to FIG. 4A, it will be seen that each output comprising a bit in the byte lane has a delay circuit 54 in its data out path. However, no delay is set by the delay circuits in the later nibble (t=0), whereas the delay circuits 54 in the earlier nibble are once again set to approximately t=50 ps. Thus, the effect is the same as that shown in FIGS. 4A and 4B, except that now the chips 50ₓ, 55ₓ, or 57 can be made uniformly, and then later programmed to address the unique problems of skew present in the byte lane.

FIG. 5B illustrates a programmable delay circuit 54. Again, inverters 62 are used as the basic delay element, with antifuses (AF) 64 spanning every two inverters 62. In their unprogrammed state, the antifuses 64 act as open circuits, and hence a delay of eight inverter stages (e.g., 120 ps) would be introduced if none of the antifuses 64 are programmed; six stages if one of the antifuses is programmed; four stages if two of the antifuses are programmed; two stages if three of the antifuses are programmed; and no delay if all of the antifuses are programmed. FIG. 5C achieves this same programming ability, with the need to only program one antifuse to effect a delay between zero and eight inverter stages. As antifuses and methods for programming them are well known in the semiconductor art, the circuitry used to do so is not shown.

Fuses could also be used, as illustrated in FIGS. 5D and 5E. In their unprogrammed state, the fuses 66 act as short circuits, and hence no delay would be introduced if none of the fuses 66 are programmed in FIG. 5D; two inverter stages of delay if one of the fuses is programmed; four stages if two of the fuses are programmed; six stages if three of the fuses are programmed; and eight stages if all of the fuses are programmed. FIG. 5E achieves this same programming ability, with the need to only program one fuse to effect a delay between zero and eight inverter stages. As fuses and methods for programming them are well known in the semiconductor art, the circuitry used to do so is not shown. The fuses 66 may be either programmable by signal (i.e., by the application of a voltage across the fuse), or by light (e.g., by laser ablation).
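The relationship between programmed elements and delay in the arrangements described for FIGS. 5B and 5D can be modeled, by way of a non-limiting sketch, as follows. The model assumes four bypass elements, each spanning one pair of inverters, and the example delay of 15 ps per inverter; the function and constant names are hypothetical, and the single-element arrangements of FIGS. 5C and 5E are not modeled.

```python
# Illustrative model only; topology and names are assumptions drawn from the
# textual description of FIGS. 5B and 5D, not from the figures themselves.
INVERTER_PAIRS = 4     # four pairs = eight inverter stages
STAGE_DELAY_PS = 15    # example per-inverter delay (8 stages -> 120 ps)


def _check(programmed):
    if len(programmed) != INVERTER_PAIRS:
        raise ValueError("expected one flag per inverter pair")


def stages_with_antifuses(programmed):
    """Antifuse case: a programmed antifuse conducts and bypasses its pair."""
    _check(programmed)
    return 2 * sum(1 for p in programmed if not p)


def stages_with_fuses(programmed):
    """Fuse case: a programmed fuse opens, placing its pair in the path."""
    _check(programmed)
    return 2 * sum(1 for p in programmed if p)


def delay_ps(stages):
    """Total delay for a given number of inverter stages, in ps."""
    return stages * STAGE_DELAY_PS
```

For example, `stages_with_antifuses([True, False, False, False])` yields six stages, while `stages_with_fuses([True, False, False, False])` yields two, reflecting that the two element types default to opposite ends of the delay range.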

Such one-time programmable approaches are destructive. Once programmed, the chips 50 (and the modules in which they reside) are permanently tailored for a particular operating environment and/or connector. Therefore, an even further preferable approach to the delay circuit 54 is the use of many-times programmable circuits whose delay can be readily changed. This would allow a module, for example, to be freely tailored for use in any operating environment at any time, even if previously programmed for a certain operating environment. One simple way of doing so, not illustrated in the Figures, would be to substitute an Erasable Programmable Read Only Memory (EPROM) cell for either the antifuses 64 or fuses 66 of FIGS. 5B-5E. Such a cell could be UV erasable using radiation or electrically erasable via application of an erase voltage. Again, such techniques are well known in the art.

Programming of the delay circuits on the memory chips can take place using an on-chip mode register. As one skilled in the art understands, a mode register contains various settings used to tailor the operation of the chip. The mode register can be programmed using special test modes, usually by activating otherwise standard control signals on the chips or the module in unique sequences. Using such a standard technique, the delay value for each of the delay circuits can be easily programmed. Such programming could occur at the chip level (i.e., before the chips are mounted to a PCB), or at the board level (i.e., after mounting to the module PCB).
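This disclosure does not specify how delay values would be encoded in a mode register. Solely as a hypothetical illustration, the sketch below packs one small delay code per DQ output into a single register word and decodes it back into inverter stage counts. The field width, the code-to-stage mapping, and all names are assumptions invented for this example and do not reflect any actual DRAM mode register definition.

```python
# Entirely hypothetical encoding; no real mode register layout is implied.
FIELD_BITS = 3        # assumed: 3 bits of register space per DQ output
STAGES_PER_CODE = 2   # assumed: each code step adds two inverter stages
MAX_CODE = 4          # assumed: codes 0..4 map to 0, 2, 4, 6, 8 stages


def pack_delay_codes(codes):
    """Pack one delay code per DQ output (DQ0 first) into a register word."""
    word = 0
    for i, code in enumerate(codes):
        if not 0 <= code <= MAX_CODE:
            raise ValueError(f"code {code} out of range for DQ{i}")
        word |= code << (i * FIELD_BITS)
    return word


def unpack_delay_stages(word, n_outputs=8):
    """Decode a register word into inverter stage counts, DQ0 first."""
    mask = (1 << FIELD_BITS) - 1
    return [
        ((word >> (i * FIELD_BITS)) & mask) * STAGES_PER_CODE
        for i in range(n_outputs)
    ]


# Earlier nibble (DQ0-DQ3) set to four stages, later nibble left at zero.
word = pack_delay_codes([2, 2, 2, 2, 0, 0, 0, 0])
stages = unpack_delay_stages(word)
```

Under this assumed encoding, reprogramming a module for a 0-degree connector would amount to writing a word of all zero codes.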

Of course, the use of serially-connected inverters 62 is only one way of creating a delayed signal on the earlier nibble. One skilled in the art will recognize that many different types of fixed, one-time programmable, or freely programmable delay circuits can be used to achieve the goal of delaying the internal data out signal 52. For example, other logic gates can be used, varying capacitances can be provided to achieve a desired granularity in the delay on the internal data out signal, etc.

As used herein, a “mother board” need be only another board for communicating with the memory module. While such a board would typically contain a system microprocessor were the memory module to be used in a traditional computer configuration, this is not strictly necessary. Any board capable of calling the memory module could comprise the mother board, regardless of its configuration and function.

It should be understood that the inventive concepts disclosed herein are capable of many modifications. To the extent such modifications fall within the scope of the appended claims and their equivalents, they are intended to be covered by this patent.

What is claimed is:
 1. A memory module, comprising: a plurality of data paths internal to the memory module and configured to provide data signals in parallel; and one or more connectors to receive the data signals from the plurality of data paths, the one or more connectors being couplable to a receiving circuit via a plurality of traces external to the memory module, at least one of the one or more connectors having a delay circuit to selectively impose one or more different delays on data signals in its respective data path at a respective output of the one or more connectors to provide a compensating skew to the respective data signal.